Tied Spatial Transformer Networks for Character Recognition
نویسندگان
چکیده
This paper reports a new approach applied to convolutional neural networks (CNNs), which uses spatial transformer networks (STNs). It consists in training an architecture which combines a localization CNN and a classification CNN, for which most of the weights are tied, which from here on we will name Tied Spatial Transformer Networks (TSTNs). The localization CNN is used for predicting the best affine transform for the input image, which is then processed according to the predicted parameters and passed through the classification CNN. We have conducted initial experiments on the cluttered MNIST dataset, comparing the TSTN and Spatial Transformer Networks (STN) with untied weights, as well as the classification CNN only. In all these cases, we obtain better results using the TSTN architecture. RÉSUMÉ. Cet article présente une nouvelle approche appliquée aux réseaux de neurones convolutionnels (RNC), qui utilise les réseaux de transformations spatiales (RTS). L’approche consiste à construire une architecture combinant un RNC pour la localisation et un RNC pour la classification. Bien que les deux réseaux soient dédiés à des taches différentes, la majorité de leurs poids sont partagées. Par la suite nous appelons ce type de réseaux réseaux de transformations spatiales liées ou RTSL. Le RNC de localisation est utilisé pour prédire la meilleure transformée affine que l’on peut appliquer sur l’image d’entrée, qui est ensuite traitée selon les paramètres prédits et passée au RNC de classification, qui fournit la prédiction. Nous avons mené des expériences initiales sur l’ensemble de données cluttered-MNIST comprenant des bruits additionnels. Nous comparons le RTSL et un RTS avec poids non liés, ainsi qu’avec le RNC de classification seul. Dans tous les cas, de meilleurs résultats sont obtenus en utilisant l’architecture RTSL proposée.
منابع مشابه
Gradient-based Learning Applied to Document Recognition
Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradientbased learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper r...
متن کاملGradient - Based Learning Applied
| Multilayer Neural Networks trained with the backpropa-gation algorithm constitute the best example of a successful Gradient-Based Learning technique. Given an appropriate network architecture, Gradient-Based Learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns such as handwritten characters , with minimal preprocessing. This pape...
متن کاملTO PROCEEDINGS OF THE IEEE , 1998 1 Gradient - Based Learning Applied to DocumentRecognitionYann
| Multilayer Neural Networks trained with the backpropa-gation algorithm constitute the best example of a successful Gradient-Based Learning technique. Given an appropriate network architecture, Gradient-Based Learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns such as handwritten characters , with minimal preprocessing. This pape...
متن کاملPROC OF THE IEEE NOVEMBER Gradient Based Learning Applied to Document Recognition
Multilayer Neural Networks trained with the backpropa gation algorithm constitute the best example of a successful Gradient Based Learning technique Given an appropriate network architecture Gradient Based Learning algorithms can be used to synthesize a complex decision surface that can classify high dimensional patterns such as handwritten char acters with minimal preprocessing This paper revi...
متن کاملSTN-OCR: A single Neural Network for Text Detection and Text Recognition
Detecting and recognizing text in natural scene images is a challenging, yet not completely solved task. In recent years several new systems that try to solve at least one of the two sub-tasks (text detection and text recognition) have been proposed. In this paper we present STN-OCR, a step towards semi-supervised neural networks for scene text recognition that can be optimized end-to-end. In c...
متن کامل